Wordnets: State of the Art and Perspectives. Case Study: the Romanian Wordnet

Author

  • Verginica Barbu Mititelu
Abstract

During a quarter of a century of existence, and in spite of much criticism, wordnets have proved both their appropriateness as repositories of linguistic knowledge and their usefulness in various applications. In this paper we present the methodology of creating the Romanian wordnet (RoWN), with special emphasis on the strategies adopted during ten years of continuous implementation, which highlight the efforts invested; the way we dealt with the alignment of the RoWN (previously aligned to PWN 2.0) to PWN 3.0; and the future work we envisage for enriching and extending this resource.

1 Generalities on wordnets

Language is a system of signs. This structuralist perspective on language serves the description of natural languages extremely well, for both theoretical linguists and specialists in the formal representation of language. Notions like paradigm (i.e. a class of similar elements), syntagm (i.e. a linguistic environment in which the elements of a paradigm can occur), and value (the distinguishable functional role of an element in a syntagm) are modeled to serve the formal representation of language as a whole. Among the different knowledge representation formalisms, we focus here on wordnets, a special kind of semantic network.
While semantic networks have words in nodes and semantic relations as arcs, wordnets are definitely more than that. They are:

  • monolingual dictionaries: they contain words with definitions for each of their senses;
  • multilingual dictionaries: via the InterLingual Index (ILI), access from one language-specific network to all the others is facilitated; thus, it is possible to compare the organization of the lexical material of various languages, to find examples supporting the thesis of the semantic specificity of languages, and to introduce the multilingual dimension in various applications relying on wordnets;
  • thesauri: lexical information is organized in terms of word meanings, not word forms;
  • lexical ontologies: wordnets contain lexicalizations of concepts from various domains and the relations between these lexicalizations.

Several projects have enriched wordnets with ontological information: WordNet Domains (Pianta et al. 2002) and SUMO (Niles and Pease 2003).
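The roles listed above can be illustrated with a minimal sketch of two aligned wordnets. It is not the RoWN or PWN data model: the class names, the `translate` helper, the identifiers `ili-0001` and `ili-0002`, and the glosses are all assumptions made up for illustration. Each synset groups literals by meaning (the thesaurus view), carries a gloss (the monolingual dictionary view), and shares an interlingual identifier with its counterpart in the other language (the multilingual view).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    ili: str                 # hypothetical interlingual identifier shared across languages
    literals: List[str]      # synonymous word forms expressing one meaning
    gloss: str               # illustrative definition, not taken from any real wordnet
    relations: Dict[str, List[str]] = field(default_factory=dict)  # relation name -> ILI ids


class Wordnet:
    def __init__(self, language: str) -> None:
        self.language = language
        self.synsets: Dict[str, Synset] = {}

    def add(self, synset: Synset) -> None:
        self.synsets[synset.ili] = synset

    def senses(self, word: str) -> List[Synset]:
        """Return every synset that lists the given word form as a literal."""
        return [s for s in self.synsets.values() if word in s.literals]


def translate(word: str, source: Wordnet, target: Wordnet) -> List[str]:
    """Collect target-language literals aligned to each sense of the word via the shared id."""
    result: List[str] = []
    for synset in source.senses(word):
        aligned = target.synsets.get(synset.ili)
        if aligned is not None:
            result.extend(aligned.literals)
    return result


# Toy data: identifiers and glosses are invented for this example.
en = Wordnet("en")
en.add(Synset("ili-0002", ["animal"], "a living organism that can move"))
en.add(Synset("ili-0001", ["dog", "domestic dog"], "a domesticated canine",
              {"hypernym": ["ili-0002"]}))

ro = Wordnet("ro")
ro.add(Synset("ili-0002", ["animal"], "illustrative gloss"))
ro.add(Synset("ili-0001", ["câine"], "illustrative gloss",
              {"hypernym": ["ili-0002"]}))

print(translate("dog", en, ro))
```

Keying both networks by the same identifier is what makes cross-language lookup a plain dictionary access in this sketch; a real alignment also has to handle concepts that are lexicalized in one language but not in the other.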

Similar resources

A Methodology and Associated Tools for Building Interlingual Wordnets

The paper reports on the ongoing effort towards the development of a Romanian wordnet aligned to the Princeton WordNet. The first part generically describes the methodology we used as well as the language resources that supported our approach. In the second part we will describe the tools that implemented this methodology and a quantitative account for the content of the Romanian wordnet at the ti...

News about the Romanian Wordnet

There are more than 60 wordnets worldwide; the Romanian wordnet is among those that are maintained and further developed. Begun within the BalkaNet project and further enriched in various (application oriented) projects, it was used in word sense disambiguation, machine translation and question answering with promising results. We present here the latest qualitative and quantitative improvement...

Word Sense Disambiguation as a Wordnets' Validation Method in Balkanet

BalkaNet is a European project which aims at the development of monolingual wordnets for five languages in the Balkans area (Bulgarian, Greek, Romanian, Serbian, and Turkish) and at the improvement of the Czech wordnet developed in the EuroWordNet project. The wordnets are aligned to the Princeton Wordnet, according to the principles established by the EuroWordNet consortium. One of the main concerns...

Adding Morpho-semantic Relations to the Romanian Wordnet

Keeping pace with the development of other wordnets, we present the challenges raised by the Romanian derivational system and our methodology for identifying derived words and their stems in the Romanian Wordnet. To attain this aim we rely only on the list of literals in the wordnet and on a list of Romanian affixes; the automatically obtained pairs require automatic and manual validation, based on a ...

Romanian WordNet: New Developments and Applications

Among the existing ontologies, the multilingual lexical ontologies have a special status. Structured in a similar way to standard ontologies, the lexical ones are distinguished by the fundamental requirement that each conceptualized entity is lexicalized by one or more synonymous words (a synset) of the natural language vocabulary. Multilingually aligned wordnets, such as EuroWordNet or BalkaNe...

Publication date: 2011